Bilingual Random Walk Models for Automated Grammar Correction of ESL Author-Produced Text
Abstract
We present a novel noisy channel model for correcting text produced by English as a second language (ESL) authors. We model the English word choices made by ESL authors as a random walk across an undirected bipartite dictionary graph whose edges connect English words to associated words in the author’s native language. We present two such models. Both use cascades of weighted finite-state transducers (wFSTs) to represent language model priors, random walk-induced noise, and observed sentences, and both learn their parameters with expectation maximization (EM), following Park and Levy (2011). We show that such models can make intelligent word substitutions to improve grammaticality in an unsupervised setting.
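The noisy channel idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it replaces the wFST cascade with brute-force enumeration, fixes the walk probability instead of learning it with EM, and limits the walk to a single English → native → English hop with uniform edge choices. The dictionary `TOY_DICT`, the bigram table `TOY_BIGRAM`, the parameter `walk_prob`, and all function names are hypothetical and chosen only for illustration.

```python
import itertools
from collections import defaultdict

# Hypothetical toy data: English word -> native-language (here French) words.
TOY_DICT = {
    "make": ["faire"],
    "do": ["faire"],
    "homework": ["devoirs"],
    "duties": ["devoirs"],
}

# Hypothetical bigram language model prior P(word | previous word).
TOY_BIGRAM = {
    ("<s>", "make"): 0.1,
    ("<s>", "do"): 0.1,
    ("make", "homework"): 0.001,
    ("do", "homework"): 0.05,
}


def build_graph(dictionary):
    """Build the undirected bipartite graph as two adjacency maps."""
    en_to_l1 = defaultdict(set)
    l1_to_en = defaultdict(set)
    for en, l1_words in dictionary.items():
        for l1 in l1_words:
            en_to_l1[en].add(l1)
            l1_to_en[l1].add(en)
    return en_to_l1, l1_to_en


def noise_prob(intended, observed, en_to_l1, l1_to_en, walk_prob=0.3):
    """P(observed | intended) under a one-hop random walk (an assumption).

    With probability 1 - walk_prob the author writes the intended word.
    Otherwise the walk goes English -> native -> English, picking each
    edge uniformly at random.
    """
    l1_words = en_to_l1.get(intended, set())
    if not l1_words:
        # Word has no dictionary edges, so it cannot be corrupted.
        return 1.0 if observed == intended else 0.0
    stay = (1.0 - walk_prob) if observed == intended else 0.0
    hop = 0.0
    for l1 in l1_words:
        neighbours = l1_to_en[l1]
        if observed in neighbours:
            hop += (1.0 / len(l1_words)) * (1.0 / len(neighbours))
    return stay + walk_prob * hop


def correct(observed, bigram, en_to_l1, l1_to_en, walk_prob=0.3, floor=1e-6):
    """Return the intended sentence maximizing prior * channel probability."""
    vocab = sorted(en_to_l1)
    options = []
    for word in observed:
        # Candidates are dictionary words that could have produced `word`.
        cands = [c for c in vocab
                 if noise_prob(c, word, en_to_l1, l1_to_en, walk_prob) > 0]
        options.append(cands or [word])
    best, best_score = list(observed), -1.0
    for cand in itertools.product(*options):
        score, prev = 1.0, "<s>"
        for c, word in zip(cand, observed):
            prior = bigram.get((prev, c), floor)  # unseen bigrams get a floor
            score *= prior * noise_prob(c, word, en_to_l1, l1_to_en, walk_prob)
            prev = c
        if score > best_score:
            best, best_score = list(cand), score
    return best


en_to_l1, l1_to_en = build_graph(TOY_DICT)
corrected = correct(["make", "homework"], TOY_BIGRAM, en_to_l1, l1_to_en)
```

In this toy setup, "make" and "do" share the native word "faire", so the channel allows "do" as the intended word behind an observed "make", and the language model prior prefers "do homework". A realistic system would compose the prior, the noise model, and the observed sentence as transducers rather than enumerate candidates, and would estimate the walk parameters from data.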
Similar resources
Evaluating the Quality of Web-Mined Bilingual Sentence Pairs
We raise the problem of evaluating the quality of bilingual sentence pairs mined from the web, which is critical for a wide range of applications such as statistical machine translation (SMT) and English as a Second Language (ESL) learning. To address this problem, we propose a novel method that integrates multiple linguistic features related to spelling, grammar, alignment, and particular...
Using Error-Annotated ESL Data to Develop an ESL Error Correction System
This paper presents research on building a model of grammatical error correction, for preposition errors in particular, in English text produced by language learners. Unlike most previous work which trains a statistical classifier exclusively on well-formed text written by native speakers, we train a classifier on a large-scale, error-tagged corpus of English essays, relying on contextual and g...
How Does Focus on Form Affect the Revising Processes of ESL Writers?: Two Case Studies
This study considers the ongoing “grammar correction debate” in second language writing by examining how a focus on formal accuracy would affect the revising processes of ESL writers and the students’ written products. A case study approach was used to find out how two ESL students would respond in the two different rewriting situations: (a) when there is no explicit expectation for them to pro...
Automated Grammar Correction Using Hierarchical Phrase-Based Statistical Machine Translation
We introduce a novel technique that uses hierarchical phrase-based statistical machine translation (SMT) for grammar correction. SMT systems provide a uniform platform for any sequence transformation task. Thus grammar correction can be considered a translation problem from incorrect text to correct text. Over the years, grammar correction data in the electronic form (i.e., parallel corpora of ...
Automated Grammar Checking of Tenses for ESL Writing
Various word-processing systems have been developed to identify grammatical errors and mark learners’ essays. However, they are not specifically developed for Malaysian ESL (English as a second language) learners. A marking tool capable of identifying errors in ESL writing for these learners is very much needed. Though there are numerous techniques adopted in grammar checking and automated...